Parameter-wise co-clustering for high-dimensional data
Abstract
In recent years, data dimensionality has increasingly become a concern, leading to many parameter and dimension reduction techniques being proposed in the literature. A parameter-wise co-clustering model, for (possibly high-dimensional) data modelled via continuous random variables, is presented. The proposed model, although allowing more flexibility, still maintains the very high degree of parsimony and interpretability achieved by traditional co-clustering. More precisely, the keystone consists of dramatically increasing the number of column-clusters while expressing each as a combination of a limited number of mean-dependent and variance-dependent column-clusters. A stochastic expectation-maximization algorithm along with a Gibbs sampler is used for estimation, and an integrated complete log-likelihood criterion is used for model selection. Simulated and real datasets are used for illustration and comparison.
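The idea of building many column-clusters from a few mean-dependent and variance-dependent ones can be illustrated with a small simulation. This is only a sketch under assumptions not stated in the abstract: each cell is drawn from a Gaussian whose mean depends on the row-cluster and the column's mean-cluster, and whose standard deviation depends on the row-cluster and the column's variance-cluster. The function name, the parameter values, and the parameter counts (which ignore mixing proportions) are hypothetical.

```python
import random


def simulate_parameter_wise(n_rows, n_cols, mu, sigma, seed=0):
    """Simulate a data matrix under an assumed parameter-wise structure.

    mu[g][l]    : mean for row-cluster g and mean column-cluster l
    sigma[g][m] : standard deviation for row-cluster g and variance column-cluster m
    """
    rng = random.Random(seed)
    n_row_clusters = len(mu)
    n_mean_clusters = len(mu[0])
    n_var_clusters = len(sigma[0])

    # Each row gets one label; each column gets TWO labels (mean and variance).
    z = [rng.randrange(n_row_clusters) for _ in range(n_rows)]
    w_mean = [rng.randrange(n_mean_clusters) for _ in range(n_cols)]
    w_var = [rng.randrange(n_var_clusters) for _ in range(n_cols)]

    x = [
        [rng.gauss(mu[z[i]][w_mean[j]], sigma[z[i]][w_var[j]]) for j in range(n_cols)]
        for i in range(n_rows)
    ]
    return x, z, w_mean, w_var


# Hypothetical parameters: 2 row-clusters, 3 mean-clusters, 2 variance-clusters.
mu = [[0.0, 5.0, -5.0], [2.0, -2.0, 8.0]]
sigma = [[0.5, 2.0], [1.0, 3.0]]

x, z, w_mean, w_var = simulate_parameter_wise(200, 60, mu, sigma)

# A column's effective cluster is the pair (mean label, variance label),
# so at most 3 * 2 = 6 combined column-clusters can appear.
combined = set(zip(w_mean, w_var))

# Parameters needed: one mean per (row, mean-cluster) block plus one
# standard deviation per (row, variance-cluster) block.
n_params_parameter_wise = len(mu) * len(mu[0]) + len(sigma) * len(sigma[0])

# A co-clustering with 6 unrelated column-clusters would need a mean and a
# variance for every (row-cluster, column-cluster) block.
n_params_full = 2 * len(mu) * (len(mu[0]) * len(sigma[0]))

print(len(combined), n_params_parameter_wise, n_params_full)
```

In this toy setting the combined column partition can be as fine as six clusters while the block parameters number 10 rather than 24, which is the kind of parsimony the abstract describes.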
Similar resources
Hierarchical Information-theoretic Co-clustering for High Dimensional Data
Hierarchical clustering is an important technique for hierarchical data exploration applications. However, most existing hierarchical methods are based on traditional one-side clustering, which is not effective for handling high dimensional data. In this paper, we develop a partitional hierarchical co-clustering framework and propose a Hierarchical Information-Theoretical Co-Clustering (HITCC) a...
A Hierarchical Probabilistic Model for Co-Clustering High-Dimensional Data
We propose a hierarchical, model-based co-clustering framework for handling high-dimensional datasets. The technique views the dataset as a joint probability distribution over row and column variables. Our approach starts by initially clustering rows in a dataset, where each cluster is characterized by a different probability distribution. Subsequently, the conditional distribution of attribute...
Model-based Co-clustering for High Dimensional Sparse Data
We propose a novel model based on the von Mises-Fisher (vMF) distribution for co-clustering high dimensional sparse matrices. While existing vMF-based models are only suitable for clustering along one dimension, our model acts simultaneously on both dimensions of a data matrix. Thereby it has the advantage of exploiting the inherent duality between rows and columns. Setting our model under the m...
High-dimensional data clustering
Clustering in high-dimensional spaces is a difficult problem which is recurrent in many domains, for example in image analysis. The difficulty is due to the fact that high-dimensional data usually live in different low-dimensional subspaces hidden in the original space. This paper presents a family of Gaussian mixture models designed for high-dimensional data which combine the ideas of subspace c...
Attribute Selection for High Dimensional Data Clustering
We present a new method to select an attribute subset (with little or no loss of information) for high dimensional data clustering. Most existing clustering algorithms lose some of their efficiency in high dimensional data sets. One possible solution is to use only a subset of the whole set of dimensions. But the number of possible dimension subsets is too large to be fully parsed. We use a he...
Journal
Journal title: Computational Statistics
Year: 2022
ISSN: 0943-4062, 1613-9658
DOI: https://doi.org/10.1007/s00180-022-01289-2